Performance Evaluation of Mutation / Non-Mutation Based Classification With Missing Data

Author

  • N. C. Vinod
Abstract

Missing data is a common problem for many data mining techniques. A missing value is an attribute or feature in a dataset that has no associated data value. Correct treatment of missing values is crucial, because they adversely affect the interpretation and results of data mining processes. Missing value handling techniques can be grouped into four categories: complete case analysis, imputation methods, maximum likelihood methods, and machine learning methods. Of these, imputation methods are the most widely used solution for handling missing values. However, there are situations in which imputation methods might not work correctly. This study analyzes the performance of two classification algorithms on missing data, one based on imputation and one that works without imputation.

Keywords: missing values, imputation, non-imputation, classification with missing data.
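The difference between two of the categories named in the abstract, complete case analysis and imputation, can be illustrated with a minimal sketch. The toy dataset, the column names, and the choice of mean imputation are assumptions made here for illustration; the abstract does not say which imputation method or which classifiers the study evaluates.

```python
from statistics import mean

# Hypothetical toy dataset; None marks a missing value.
rows = [
    {"age": 25, "income": 40000.0},
    {"age": None, "income": 52000.0},
    {"age": 41, "income": None},
    {"age": 33, "income": 61000.0},
]


def complete_cases(rows):
    """Complete case analysis: keep only rows with no missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def mean_impute(rows):
    """Mean imputation: fill each missing value with its column's observed mean."""
    columns = list(rows[0].keys())
    col_means = {
        c: mean(r[c] for r in rows if r[c] is not None) for c in columns
    }
    return [
        {c: (r[c] if r[c] is not None else col_means[c]) for c in columns}
        for r in rows
    ]


cc = complete_cases(rows)    # drops the two incomplete rows
imputed = mean_impute(rows)  # keeps all four rows, with gaps filled
```

Complete case analysis discards half of this toy dataset, while mean imputation keeps every row at the cost of introducing estimated values. That trade-off is one reason a comparison with methods that avoid imputation is of interest.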


Similar resources

Application of Clustering Methods in DNA Microarrays

Background: DNA microarray technology has enabled investigators to study the expression of thousands of genes in a short time. Analysis of this large amount of raw data includes normalization, clustering and classification. The present study surveys the application of clustering techniques in DNA microarray analysis. Materials and methods: We analyzed data from the study by Van’t Veer et al. dealing with BRCA1...


Evaluation of JAK2V617F mutation prevalence in myeloproliferative neoplasm by AS-RT-PCR

Abstract Objective: JAK2 is a non-receptor tyrosine kinase that plays a major role in myeloid disorders. The JAK2V617F mutation is characterized by a G to T transversion at nucleotide 1849 in exon 12 of the JAK2 gene, located on chromosome 9p, leading to a substitution of valine by phenylalanine at amino acid position 617 in the JAK2 protein. Methods: In this study we evaluated RNA from 89 pati...


Dimensionality Reduction and Improving the Performance of Automatic Modulation Classification using Genetic Programming (RESEARCH NOTE)

This paper shows how genetic programming can be used to select suitable features for automatic modulation recognition. Automatic modulation recognition is one of the essential components of modern receivers. In this regard, the selection of suitable features may significantly affect the performance of the process. Simulations were conducted at SNRs of 5 dB and 10 dB. Test and ...


DNA Sequence Fragment Containing C to A Mutation as a Convenient Mutation Standard for DHPLC Analysis

Objective(s):  Denaturing high performance liquid chromatography (DHPLC) is a high throughput approach for screening DNA sequence variations. To assess oven calibration, cartridge performance, buffer composition and stability, the WAVE Low and High Range Mutation Standards are employed to ensure reproducibility and accuracy of the chromatographic analysis. The purpose of this study was to provi...


Investigation of BTNL2 Gene Mutation in Patients with Cutaneous Sarcoidosis

Background and Aim: Sarcoidosis is a non-caseous granulomatous disease that can involve several organs such as lung, kidney, liver, heart and skin. In systemic sarcoidosis, skin lesions occur in 20-35% of patients. Cutaneous sarcoidosis with no systemic involvement was found in about 25% of patients. Mutation within Butyrophilin-like 2 (BTNL2) gene, rs2076530 was reported in systemic sarcoidosi...




Publication date: 2013